- Finish off R stuff
- Discussion about Mediation and Moderation
  - Baron and Kenny
  - Rucker, Preacher, Tormala, and Petty
  - Rohrer
  - Spencer, Zanna, and Fong
- Mediation/Moderation in R

2022/03/18
R stuff: Dates and times in R! The lubridate package handles dates (<date>), times (<time>), and date-times (<dttm>). For instance:

library(lubridate) # not loaded with tidyverse
today() # a <date>
## [1] "2022-03-17"
now() # a <dttm>
## [1] "2022-03-17 20:34:19 PDT"
lubridate has many functions for converting vectors:
ymd("2017-01-31")
## [1] "2017-01-31"
mdy("January 31st, 2017")
## [1] "2017-01-31"
dmy("31-Jan-2017")
## [1] "2017-01-31"
Also of interest: as_date(), make_date(), as.duration()
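A quick sketch of those constructors (the specific dates here are just illustrations):

```r
library(lubridate)

# Build a date from numeric components
d1 <- make_date(year = 2017, month = 1, day = 31)
d1
## [1] "2017-01-31"

# Coerce days-since-epoch (1970-01-01) into a date
as_date(0)
## [1] "1970-01-01"

# Express a span of time as a duration (stored in seconds)
as.duration(interval(ymd("2017-01-01"), ymd("2017-01-31")))
```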
R uses formulas to generate “design matrices” for modeling functions:

my_mod <- lm(y ~ x, data = dat)
The formula interface defines the columns to be used in the design matrix (as well as the rows, if a subset argument is also passed to lm()!).
This interface is used by a wide variety of packages for a wide variety of purposes!
- Tilde (~): In a two-sided formula, separates the response variables (left side) from the predictor variables (right side); can also be used to invoke a one-sided formula.
- Plus (+): Used to add main effects for additional variables (e.g., y ~ x1 + x2).
- Minus (-): Used to remove terms from a model.
- Colon (:): Used to incorporate a particular interaction.
- Asterisk (*): Used to incorporate a “crossing” (the highest-order interaction between the terms and all lower-order forms).
- %in%: Sometimes used to declare nesting between variables.

# Cross-Tables (rows by columns)
g1 ~ g2
# Independent Samples t-Test
y ~ group
# Equivalent Interactions:
y ~ x1 * x2
y ~ x1 + x2 + x1:x2
# Polynomial Regression:
y ~ x + I(x^2) + I(x^3)
# Factorial ANOVA:
y ~ (a*b*c)^2
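To see exactly which columns a formula produces, model.matrix() can be called directly. A small sketch with made-up data (the data frame here is hypothetical, purely for inspecting the expansion):

```r
# Hypothetical data, just to inspect the design matrix
dat <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(0, 1, 0, 1))

# x1 * x2 expands to the intercept, both main effects, and the interaction
colnames(model.matrix(~ x1 * x2, data = dat))
## [1] "(Intercept)" "x1"          "x2"          "x1:x2"
```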
R packages for mediation analyses:

- BayesMed (Nuijten, Wetzels, Matzke, Dolan, & Wagenmakers, 2015)
- bmem (Zhang & Wang, 2011)
- mediation (Tingley, Yamamoto, Hirose, Keele, & Imai, 2014)
- powerMediation (Qiu, 2015)
- RMediation (Tofighi & MacKinnon, 2010)

Functions within other packages:

- mediate() in the psych package (Revelle, 2012)
- mediation() in the MBESS package (Kelley & Lai, 2012)

The following example data (epi.bfi) come from the psych package (Revelle, 2012). Does the relation between neuroticism (X) and depression (Y) depend on state anxiety (Z)?
- bdi = Beck Depression Inventory
- epiNeur = Neuroticism from the Eysenck Personality Inventory
- stateanx = state anxiety

# install.packages("psychTools")
library(psychTools)
data(epi.bfi)
head(epi.bfi, n = 4)
## epiE epiS epiImp epilie epiNeur bfagree bfcon bfext bfneur bfopen bdi
## 1 18 10 7 3 9 138 96 141 51 138 1
## 2 16 8 5 1 12 101 99 107 116 132 7
## 3 6 1 3 2 5 143 118 38 68 90 4
## 4 12 6 4 3 15 104 106 64 114 101 8
## traitanx stateanx
## 1 24 22
## 2 41 40
## 3 37 44
## 4 54 40
# Using lm() for the linear model formulation
example1 <- lm(bdi ~ stateanx*epiNeur, data = epi.bfi)
summary(example1)
##
## Call:
## lm(formula = bdi ~ stateanx * epiNeur, data = epi.bfi)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.0493 -2.2513 -0.4707 2.1135 11.9949
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.06367 2.18559 0.029 0.9768
## stateanx 0.03750 0.06062 0.619 0.5368
## epiNeur -0.14765 0.18869 -0.782 0.4347
## stateanx:epiNeur 0.01528 0.00466 3.279 0.0012 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.12 on 227 degrees of freedom
## Multiple R-squared: 0.4978, Adjusted R-squared: 0.4912
## F-statistic: 75.02 on 3 and 227 DF, p-value: < 2.2e-16
Note! We have a significant interaction. We may want to probe this interaction by creating a simple slopes plot.
The rockchalk package (Johnson, 2015) offers a variety of helpful functions to generate simple slopes plots, test simple slopes, and generate regression tables.
# install.packages("rockchalk")
library(rockchalk)
The plotSlopes() function produces a simple slopes plot.
We have to specify our model, followed by the variable to be plotted on the x-axis, followed by the other variable in the interaction (our “moderator”), followed by the values at which we want to set the moderator.
R tip: Remember that predictor names need to be included in quotation marks!
plotSlopes(example1, plotx="epiNeur", modx="stateanx",
modxVals="std.dev.", main = "Simple Slopes Plot Example")
Instead of using standard deviations (through the "std.dev." option in modxVals), we can specify "quantile" or our own values. Since state anxiety ranges from 21 to 79, perhaps we are interested in the slopes at 30, 40, 50, 60, and 70:
plotSlopes(example1, plotx="epiNeur", modx = "stateanx",
modxVals = c(30,40,50,60,70), main = "Simple Slopes Plot Example")
If you want to add 95% CIs around the slopes, use plotCurves():
plotCurves(example1, plotx="epiNeur", modx = "stateanx", modxVals ="std.dev.",
interval = "confidence", main = "Simple Slopes Plot Example")
plotCurves(example1, plotx="epiNeur", modx="stateanx", modxVals = c(30,40,50,60,70),
interval="confidence", main = "Simple Slopes Plot Example")
Data are from Pollack, VanEpps, & Hayes (2012; also found in Hayes, 2013).
Does economic stress (X) lead to a desire to withdraw from small business (Y), as a result of negative affect (M)?
- estress (1-7 Likert scale)
- affect (1-5 Likert scale)
- withdraw (1-7 Likert scale)

estressData <- read.table("estress.csv", sep = ",", header = TRUE)
# install.packages("MBESS")
library(MBESS)
results2 <- mediation(x = estressData$estress,
mediator = estressData$affect,
dv = estressData$withdraw,
bootstrap = TRUE, B = 10000,
which.boot = "BCa")
## [1] "Bootstrap resampling has begun. This process may take a considerable amount of time if the number of replications is large, which is optimal for the bootstrap procedure."
# Note: This conducts 10,000 replications to calculate bias-corrected and
# accelerated bootstrap confidence intervals
The results2 object contains:
round(results2,3)
## Estimate CI.Lower_BCa CI.Upper_BCa
## Indirect.Effect 0.133 0.078 0.213
## Indirect.Effect.Partially.Standardized 0.107 0.059 0.165
## Index.of.Mediation 0.152 0.085 0.235
## R2_4.5 -0.003 -0.035 0.037
## R2_4.6 0.020 0.007 0.046
## R2_4.7 0.113 0.018 0.193
## Ratio.of.Indirect.to.Total.Effect 2.369 -1.000 305.891
## Ratio.of.Indirect.to.Direct.Effect -1.730 -46.761 3.414
## Success.of.Surrogate.Endpoint 0.325 -0.816 0.853
## Residual.Based_Gamma 0.024 -0.001 0.055
## Residual.Based.Standardized_gamma 0.037 0.011 0.071
## SOS -0.658 -2037.209 0.999
We could ask for the first row of results2 to see the estimate of the indirect effect:
round(results2[1,], 3)
## Estimate CI.Lower_BCa CI.Upper_BCa ## 0.133 0.078 0.213
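The indirect effect reported above is the product of the a path (X predicting M) and the b path (M predicting Y, controlling for X). As a sketch of that logic, here we use simulated stand-in data with known paths; the real analysis would use estressData:

```r
set.seed(1)

# Simulated stand-ins for estress (X), affect (M), and withdraw (Y)
n <- 1000
estress  <- rnorm(n)
affect   <- 0.5 * estress + rnorm(n)                # true a path = 0.5
withdraw <- 0.7 * affect + 0.2 * estress + rnorm(n) # true b path = 0.7
simData  <- data.frame(estress, affect, withdraw)

# a path: mediator regressed on the predictor
a <- coef(lm(affect ~ estress, data = simData))["estress"]

# b path: outcome regressed on the mediator, controlling for the predictor
b <- coef(lm(withdraw ~ affect + estress, data = simData))["affect"]

# Point estimate of the indirect effect (population value: 0.5 * 0.7 = 0.35)
unname(a * b)
```

Bootstrapping, as MBESS does above, is still needed for the confidence interval, since the product a*b is not normally distributed in general.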
Suppose you have 3 predictors (X1, X2, & X3) and 2 mediators (M1 & M2) that predict one outcome (Y). Data come from the Mplus version 7 (Muthen & Muthen, 2012) user's guide.
ex3 <- read.table("ex2data.txt", header = TRUE)
This model requires a latent variable modeling package to run. Typical options are lavaan, sem, or OpenMx.
#install.packages("lavaan")
library(lavaan)
example3 <- ' ## define regressions
m1 ~ a1*x1 + x2 + x3
m2 ~ a2*x1 + x2 + x3
y1 ~ b1*m1 + b2*m2 + x2
## define indirect effects
ind1 := a1 * b1
ind2 := a2 * b2
totalind := ind1 + ind2
## correlated residual variances
m1 ~~ m2 '
# Use sem() to analyze the model
results3 <- sem(example3, data = ex3, meanstructure = TRUE,
                se = "boot", bootstrap = 1000)
# The model will take a few minutes to run because we have requested 1000 replications.
# Use summary() to view results
summary(results3, standardized = TRUE)
## lavaan 0.6-9 ended normally after 39 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 16
##
## Number of observations 500
##
## Model Test User Model:
##
## Test statistic 706.322
## Degrees of freedom 2
## P-value (Chi-square) 0.000
##
## Parameter Estimates:
##
## Standard errors Bootstrap
## Number of requested bootstrap draws 1000
## Number of successful bootstrap draws 1000
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## m1 ~
## x1 (a1) 2.935 0.050 58.869 0.000 2.935 0.759
## x2 1.992 0.050 39.522 0.000 1.992 0.497
## x3 1.023 0.049 20.892 0.000 1.023 0.256
## m2 ~
## x1 (a2) 2.672 0.067 40.100 0.000 2.672 0.507
## x2 3.544 0.073 48.361 0.000 3.544 0.649
## x3 2.318 0.075 31.099 0.000 2.318 0.426
## y1 ~
## m1 (b1) -0.613 0.053 -11.637 0.000 -0.613 -0.607
## m2 (b2) 1.116 0.048 23.458 0.000 1.116 1.506
## x2 -0.642 0.123 -5.206 0.000 -0.642 -0.159
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .m1 ~~
## .m2 1.104 0.089 12.369 0.000 1.104 0.552
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .m1 -0.042 0.052 -0.799 0.424 -0.042 -0.010
## .m2 0.499 0.078 6.425 0.000 0.499 0.089
## .y1 -1.665 0.091 -18.275 0.000 -1.665 -0.399
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .m1 1.408 0.082 17.240 0.000 1.408 0.082
## .m2 2.842 0.173 16.432 0.000 2.842 0.089
## .y1 3.781 0.241 15.662 0.000 3.781 0.217
##
## Defined Parameters:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## ind1 -1.798 0.161 -11.154 0.000 -1.798 -0.460
## ind2 2.981 0.149 20.051 0.000 2.981 0.763
## totalind 1.183 0.065 18.152 0.000 1.183 0.303
parameterEstimates(results3, output = "text")
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## m1 ~
## x1 (a1) 2.935 0.050 58.869 0.000 2.839 3.032
## x2 1.992 0.050 39.522 0.000 1.895 2.091
## x3 1.023 0.049 20.892 0.000 0.929 1.121
## m2 ~
## x1 (a2) 2.672 0.067 40.100 0.000 2.548 2.809
## x2 3.544 0.073 48.361 0.000 3.393 3.683
## x3 2.318 0.075 31.099 0.000 2.179 2.470
## y1 ~
## m1 (b1) -0.613 0.053 -11.637 0.000 -0.719 -0.514
## m2 (b2) 1.116 0.048 23.458 0.000 1.021 1.207
## x2 -0.642 0.123 -5.206 0.000 -0.892 -0.390
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## .m1 ~~
## .m2 1.104 0.089 12.369 0.000 0.914 1.274
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## .m1 -0.042 0.052 -0.799 0.424 -0.145 0.067
## .m2 0.499 0.078 6.425 0.000 0.346 0.653
## .y1 -1.665 0.091 -18.275 0.000 -1.848 -1.490
##
## Variances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## .m1 1.408 0.082 17.240 0.000 1.242 1.566
## .m2 2.842 0.173 16.432 0.000 2.491 3.159
## .y1 3.781 0.241 15.662 0.000 3.318 4.239
##
## Defined Parameters:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## ind1 -1.798 0.161 -11.154 0.000 -2.116 -1.492
## ind2 2.981 0.149 20.051 0.000 2.705 3.276
## totalind 1.183 0.065 18.152 0.000 1.060 1.322
fitMeasures(results3)
## npar fmin chisq df
## 16.000 0.706 706.322 2.000
## pvalue baseline.chisq baseline.df baseline.pvalue
## 0.000 4107.449 12.000 0.000
## cfi tli nnfi rfi
## 0.828 -0.032 -0.032 1.000
## nfi pnfi ifi rni
## 0.828 0.138 0.828 0.828
## logl unrestricted.logl aic bic
## -2716.785 -2363.623 5465.569 5533.003
## ntotal bic2 rmsea rmsea.ci.lower
## 500.000 5482.218 0.839 0.788
## rmsea.ci.upper rmsea.pvalue rmr rmr_nomean
## 0.892 0.000 0.233 0.264
## srmr srmr_bentler srmr_bentler_nomean crmr
## 0.054 0.054 0.061 0.061
## crmr_nomean srmr_mplus srmr_mplus_nomean cn_05
## 0.072 0.054 0.061 5.241
## cn_01 gfi agfi pgfi
## 7.520 0.992 0.890 0.073
## mfi ecvi
## 0.494 1.477
fitted(results3)
## $cov
## m1 m2 y1 x1 x2 x3
## m1 17.103
## m2 20.989 31.757
## y1 11.437 19.935 17.433
## x1 3.373 3.059 1.321 1.143
## x2 2.335 4.104 2.463 0.039 1.066
## x3 1.119 2.673 2.236 -0.058 0.096 1.074
##
## $mean
## m1 m2 y1 x1 x2 x3
## 0.027 0.499 -1.108 0.046 -0.027 -0.012
resid(results3)
## $type
## [1] "raw"
##
## $cov
## m1 m2 y1 x1 x2 x3
## m1 0.000
## m2 0.000 0.000
## y1 0.000 0.000 0.000
## x1 0.000 0.000 -0.285 0.000
## x2 0.000 0.000 0.000 0.000 0.000
## x3 0.000 0.000 1.177 0.000 0.000 0.000
##
## $mean
## m1 m2 y1 x1 x2 x3
## 0 0 0 0 0 0
Readings on “Latent Variable Modelling and Meta-Analysis”: